AITopics | avg score

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

AutoPBO: LLM-powered Optimization for Local Search PBO Solvers

Li, Jinyuan, Chu, Yi, Sun, Yiwen, Zou, Mengchuan, Cai, Shaowei

arXiv.org Artificial Intelligence, Sep-5-2025

Pseudo-Boolean Optimization (PBO) provides a powerful framework for modeling combinatorial problems through pseudo-Boolean (PB) constraints. Local search solvers have shown excellent performance in PBO solving, and their efficiency is highly dependent on their internal heuristics to guide the search. Still, their design often requires significant expert effort and manual tuning in practice. While Large Language Models (LLMs) have demonstrated potential in automating algorithm design, their application to optimizing PBO solvers remains unexplored. In this work, we introduce AutoPBO, a novel LLM-powered framework to automatically enhance PBO local search solvers. We conduct experiments on a broad range of four public benchmarks, including one real-world benchmark, a benchmark from PB competition, an integer linear programming optimization benchmark, and a crafted combinatorial benchmark, to evaluate the performance improvement achieved by AutoPBO and compare it with six state-of-the-art competitors, including two local search PBO solvers NuPBO and OraSLS, two complete PB solvers PBO-IHS and RoundingSat, and two mixed integer programming (MIP) solvers Gurobi and SCIP. AutoPBO demonstrates significant improvements over previous local search approaches, while maintaining competitive performance compared to state-of-the-art competitors. The results suggest that AutoPBO offers a promising approach to automating local search solver design.
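For readers unfamiliar with the setting, the following is a minimal sketch of what a local search over a PBO instance can look like. It is not AutoPBO, NuPBO, or OraSLS: the instance encoding, the fixed `penalty` weight, and the greedy-flip-with-random-escape rule are all assumptions made for illustration only.

```python
import random

def violation(constraint, assign):
    """Amount by which a PB constraint sum(coef * x[var]) >= bound is violated."""
    terms, bound = constraint
    lhs = sum(coef * assign[var] for var, coef in terms)
    return max(0, bound - lhs)

def score(assign, objective, constraints, penalty=10):
    """Objective value plus a penalty per unit of violation (lower is better).
    The fixed penalty weight is a hypothetical choice for this sketch."""
    cost = sum(coef * assign[var] for var, coef in objective)
    return cost + penalty * sum(violation(c, assign) for c in constraints)

def local_search(n_vars, objective, constraints, steps=200, seed=0):
    """Greedy single-flip local search with a random flip at local optima."""
    rng = random.Random(seed)
    assign = [rng.randint(0, 1) for _ in range(n_vars)]
    best = list(assign)
    for _ in range(steps):
        current = score(assign, objective, constraints)
        best_var, best_val = None, current
        # Try every single-variable flip and remember the most improving one.
        for v in range(n_vars):
            assign[v] ^= 1
            s = score(assign, objective, constraints)
            assign[v] ^= 1
            if s < best_val:
                best_var, best_val = v, s
        if best_var is None:
            # No improving flip: escape the local optimum with a random flip.
            best_var = rng.randrange(n_vars)
        assign[best_var] ^= 1
        if score(assign, objective, constraints) < score(best, objective, constraints):
            best = list(assign)
    return best

# Toy instance: minimize x0 + 2*x1 + 3*x2
# subject to x0 + x1 + x2 >= 2 and x1 + x2 >= 1.
objective = [(0, 1), (1, 2), (2, 3)]
constraints = [([(0, 1), (1, 1), (2, 1)], 2), ([(1, 1), (2, 1)], 1)]
solution = local_search(3, objective, constraints)
```

The heuristics inside `score` and the flip-selection loop are the kind of hand-designed components that the abstract says require expert effort and tuning.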

artificial intelligence, large language model, natural language, (16 more...)


2509.04007

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)


M3HG: Multimodal, Multi-scale, and Multi-type Node Heterogeneous Graph for Emotion Cause Triplet Extraction in Conversations

Liang, Qiao, Shen, Ying, Chen, Tiantian, Zhang, Lin

arXiv.org Artificial Intelligence, Aug-27-2025

Emotion Cause Triplet Extraction in Multimodal Conversations (MECTEC) has recently gained significant attention in social media analysis, aiming to extract emotion utterances, cause utterances, and emotion categories simultaneously. However, the scarcity of related datasets, with only one published dataset featuring highly uniform dialogue scenarios, hinders model development in this field. To address this, we introduce MECAD, the first multimodal, multi-scenario MECTEC dataset, comprising 989 conversations from 56 TV series spanning a wide range of dialogue contexts. In addition, existing MECTEC methods fail to explicitly model emotional and causal contexts and neglect the fusion of semantic information at different levels, leading to performance degradation. In this paper, we propose M3HG, a novel model that explicitly captures emotional and causal contexts and effectively fuses contextual information at both inter- and intra-utterance levels via a multimodal heterogeneous graph. Extensive experiments demonstrate the effectiveness of M3HG compared with existing state-of-the-art methods. The codes and dataset are available at https://github.com/redifinition/M3HG.

large language model, machine learning, utterance, (20 more...)


doi: 10.18653/v1/2025.findings-acl.596

2508.1874

Genre: Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)


I-MCTS: Enhancing Agentic AutoML via Introspective Monte Carlo Tree Search

Liang, Zujie, Wei, Feng, Xu, Wujiang, Chen, Lin, Qian, Yuxi, Wu, Xinhui

arXiv.org Artificial Intelligence, Feb-20-2025

Recent advancements in large language models (LLMs) have shown remarkable potential in automating machine learning tasks. However, existing LLM-based agents often struggle with low-diversity and suboptimal code generation. While recent work has introduced Monte Carlo Tree Search (MCTS) to address these issues, limitations persist in the quality and diversity of thoughts generated, as well as in the scalar value feedback mechanisms used for node selection. In this study, we introduce Introspective Monte Carlo Tree Search (I-MCTS), a novel approach that iteratively expands tree nodes through an introspective process that meticulously analyzes solutions and results from parent and sibling nodes. This facilitates a continuous refinement of the node in the search tree, thereby enhancing the overall decision-making process. Furthermore, we integrate a Large Language Model (LLM)-based value model to facilitate direct evaluation of each node's solution prior to conducting comprehensive computational rollouts. A hybrid rewarding mechanism is implemented to seamlessly transition the Q-value from LLM-estimated scores to actual performance scores. This allows higher-quality nodes to be traversed earlier. Applied to various ML tasks, our approach demonstrates a 6% absolute improvement in performance compared to strong open-source AutoML agents, showcasing its effectiveness in enhancing agentic AutoML systems. Resources are available at https://github.com/jokieleung/I-MCTS
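The abstract does not say how the hybrid reward moves the Q-value from LLM-estimated scores to actual scores. One simple way to picture such a transition, offered purely as an assumption and not as the authors' formula, is to treat the LLM estimate as a prior whose weight shrinks as real rollout scores accumulate. The function name `hybrid_q` and the `prior_weight` parameter are hypothetical.

```python
def hybrid_q(llm_estimate, rollout_scores, prior_weight=1.0):
    """Blend an LLM-estimated score with measured rollout scores.

    With no rollouts the LLM estimate is returned unchanged; each real
    score reduces the weight given to the estimate. The weighting scheme
    is an illustrative assumption, not taken from the paper.
    """
    n = len(rollout_scores)
    if n == 0:
        return llm_estimate
    mean_actual = sum(rollout_scores) / n
    w = prior_weight / (prior_weight + n)  # weight on the LLM estimate
    return w * llm_estimate + (1.0 - w) * mean_actual

# Before any rollout the node is ranked by the LLM estimate alone,
# so a promising node can be visited early.
q_before = hybrid_q(0.8, [])
# After one real evaluation the estimate and the measurement share the weight.
q_after = hybrid_q(0.8, [0.6])
```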

avg score, feature-engineered training data, simulated score, (12 more...)


2502.14693

Genre:

Research Report > Promising Solution (0.66)
Research Report > New Finding (0.48)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning

Chi, Yizhou, Lin, Yizhang, Hong, Sirui, Pan, Duyi, Fei, Yaying, Mei, Guanghao, Liu, Bangbang, Pang, Tianqi, Kwok, Jacky, Zhang, Ceyao, Liu, Bang, Wu, Chenglin

arXiv.org Artificial Intelligence, Oct-22-2024

Automated Machine Learning (AutoML) approaches encompass traditional methods that optimize fixed pipelines for model selection and ensembling, as well as newer LLM-based frameworks that autonomously build pipelines. While LLM-based agents have shown promise in automating machine learning tasks, they often generate low-diversity and suboptimal code, even after multiple iterations. To overcome these limitations, we introduce Tree-Search Enhanced LLM Agents (SELA), an innovative agent-based system that leverages Monte Carlo Tree Search (MCTS) to optimize the AutoML process. By representing pipeline configurations as trees, our framework enables agents to conduct experiments intelligently and iteratively refine their strategies, facilitating a more effective exploration of the machine learning solution space. This novel approach allows SELA to discover optimal pathways based on experimental feedback, improving the overall quality of the solutions. In an extensive evaluation across 20 machine learning datasets, we compare the performance of traditional and agent-based AutoML methods, demonstrating that SELA achieves a win rate of 65% to 80% against each baseline across all datasets. Automated Machine Learning (AutoML) is a rapidly evolving field that seeks to automate the process of designing reliable machine learning solutions with minimal human intervention. Traditional AutoML frameworks, such as Auto-WEKA (Thornton et al., 2013), Auto-Sklearn (Feurer et al., 2015; 2020), AutoGluon (Tang et al., 2024b), and H2O AutoML (LeDell & Poirier, 2020), rely on predefined search spaces and routines. These frameworks primarily focus on optimizing hyperparameters and model ensembling to find the best model configuration. However, this fixed and static approach often lacks the adaptability needed to handle diverse and dynamic data scenarios, resulting in suboptimal performance in more complex settings.
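The excerpt says SELA uses Monte Carlo Tree Search over pipeline configurations but does not give its selection rule. As a hedged illustration, the sketch below uses the standard UCT formula that many MCTS implementations apply when choosing which child to explore; the dictionary layout, the pipeline names, and the exploration constant `c=1.4` are assumptions for this example.

```python
import math

def uct(total_reward, visits, parent_visits, c=1.4):
    """Standard UCT score: mean reward plus an exploration bonus.
    Unvisited children get infinite priority so each is tried once."""
    if visits == 0:
        return math.inf
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def select_child(children, parent_visits, c=1.4):
    """Pick the child configuration with the highest UCT score."""
    return max(children, key=lambda ch: uct(ch["reward"], ch["visits"], parent_visits, c))

# Hypothetical pipeline candidates with accumulated validation rewards.
children = [
    {"name": "gbdt+target-encoding", "reward": 3.0, "visits": 5},
    {"name": "linear+one-hot", "reward": 1.0, "visits": 1},
]
chosen = select_child(children, parent_visits=6)
```

Here the rarely visited candidate wins because its exploration bonus is larger, which is how tree search avoids repeatedly refining a single pipeline.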

large language model, machine learning, natural language, (19 more...)


2410.17238

Country:

Asia > China > Hong Kong (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(4 more...)

Genre: Research Report > Promising Solution (0.48)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


Benchmarking zero-shot stance detection with FlanT5-XXL: Insights from training data, prompting, and decoding strategies into its near-SoTA performance

Aiyappa, Rachith, Senthilmani, Shruthi, An, Jisun, Kwak, Haewoon, Ahn, Yong-Yeol

arXiv.org Artificial Intelligence, Feb-29-2024

Stance detection is a fundamental computational task that is widely used across many disciplines such as political science and communication studies (Wang et al., 2019b; Küçük and Can, 2020). Its goal is to extract the standpoint or stance (e.g., Favor, Against, or Neutral) towards a target from a given text. Given that modern democratic societies make societal decisions by aggregating people's explicit stances through voting, estimation of peoples' stances is a useful task. While a representative survey is the gold standard, it falls short in scalability and cost (Salganik, 2019). Surveys can also produce biased results due to the people's tendency to report more socially acceptable positions even in …

… Such fine-tuning approaches can benefit from both the general language understanding from the pre-training as well as the problem-specific thing, even without spending a huge amount of computing resources (Wang et al., 2022a). More recently, the GPT family of models (Radford et al., 2019; Brown et al., 2020) birthed another powerful and even simpler paradigm of in-context learning ("few-shot" or "zero-shot"). Instead of tuning any parameters of the model, it simply uses the input to guide the model to produce the desired output for downstream tasks. For instance, a few examples related to the task can be fed as the context to the LLM.
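The difference between zero-shot and few-shot in-context learning can be made concrete with a small prompt builder. This is only a sketch: the instruction wording, the field labels, and the example texts are assumptions and are not the prompts used in the paper. No model is called; the function only assembles the input string.

```python
def build_prompt(text, target, examples=()):
    """Assemble a stance-detection prompt.

    With no examples this is a zero-shot prompt; passing labeled
    (text, target, stance) triples turns it into a few-shot prompt.
    """
    lines = [
        "Classify the stance of the passage towards the target "
        "as Favor, Against, or Neutral.",
        "",
    ]
    for ex_text, ex_target, ex_label in examples:
        lines += [f"Text: {ex_text}", f"Target: {ex_target}", f"Stance: {ex_label}", ""]
    # The final block leaves the stance blank for the model to complete.
    lines += [f"Text: {text}", f"Target: {target}", "Stance:"]
    return "\n".join(lines)

zero_shot = build_prompt("We need cleaner energy now.", "climate action")
few_shot = build_prompt(
    "We need cleaner energy now.",
    "climate action",
    examples=[
        ("This policy will wreck the economy.", "carbon tax", "Against"),
        ("The new law is a welcome step.", "carbon tax", "Favor"),
    ],
)
```

In both cases no model parameters are tuned; only the input changes.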

dataset, semeval 2016, stance detection, (15 more...)


2403.00236

Country:

Europe > Spain > Catalonia (0.04)
North America > United States > Indiana (0.04)
Europe > United Kingdom > Wales (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (0.94)
Research Report > New Finding (0.93)

Industry: Government > Regional Government > North America Government > United States Government (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Learning to Skip for Language Modeling

Zeng, Dewen, Du, Nan, Wang, Tao, Xu, Yuanzhong, Lei, Tao, Chen, Zhifeng, Cui, Claire

arXiv.org Artificial Intelligence, Nov-26-2023

Overparameterized large-scale language models show impressive generalization performance in in-context few-shot learning. However, most language models allocate the same amount of parameters or computation to each token, disregarding the complexity or importance of the input data. We argue that in language model pretraining, a variable amount of computation should be assigned to different tokens, and this can be efficiently achieved via a simple routing mechanism. Unlike conventional early stopping techniques, in which tokens can exit early only at early layers, we propose a more general method that dynamically skips the execution of a layer (or module) for any input token with a binary router. In our extensive evaluation across 24 NLP tasks, we demonstrate that the proposed method can significantly improve the 1-shot performance compared to other competitive baselines at only a mild extra cost for inference.
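The idea of skipping a layer per token with a binary router can be sketched in a few lines. The abstract does not describe the router's form or how it is trained, so the linear router with a zero threshold, the function names, and the toy layer below are all assumptions; the sketch covers inference-time routing only.

```python
def route(token, router_weights, bias=0.0):
    """Binary router: True means 'execute the layer' for this token.
    A linear score thresholded at zero is an illustrative assumption."""
    logit = sum(w * x for w, x in zip(router_weights, token)) + bias
    return logit > 0

def skip_layer(tokens, layer_fn, router_weights):
    """Apply layer_fn only to tokens the router selects.
    Skipped tokens pass through unchanged, costing no layer compute."""
    outputs, executed = [], 0
    for tok in tokens:
        if route(tok, router_weights):
            outputs.append(layer_fn(tok))
            executed += 1
        else:
            outputs.append(list(tok))
    return outputs, executed

def double(tok):
    """Stand-in for an expensive layer (hypothetical)."""
    return [2.0 * x for x in tok]

tokens = [[1.0, 0.0], [-1.0, 0.0]]
outputs, executed = skip_layer(tokens, double, router_weights=[1.0, 0.0])
```

Because the decision is made independently at each layer and for each token, a token can skip any layer rather than only exiting after an early one.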

baseline, computation, skiplayer, (14 more...)


2311.15436

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > France (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
